Dimension reduction for model-based clustering via mixtures of multivariate t-distributions

Authors

  • Katherine Morris
  • Paul D. McNicholas
  • Luca Scrucca
Abstract

Katherine Morris, University of Guelph, 2012. Advisor: Prof. Paul D. McNicholas.

We introduce a dimension reduction method for model-based clustering obtained from a finite mixture of t-distributions. This approach is based on existing work on reducing dimensionality in the case of finite Gaussian mixtures. The method identifies a reduced subspace of the data by considering how much the group means and group covariances vary. This subspace contains linear combinations of the original data, which are ordered by importance via the associated eigenvalues. Observations can be projected onto the subspace, and the resulting set of variables captures most of the clustering structure available in the data. The approach is illustrated using simulated and real data.
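The subspace construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the kernel matrix form is an assumption modeled on the Gaussian-mixture approach the abstract refers to, hard cluster labels and sample moments stand in for fitted t-mixture parameters, and the function name `reduction_directions` and the simulated data are hypothetical.

```python
import numpy as np

def reduction_directions(X, labels):
    """Hypothetical helper: directions ordered by eigenvalue, built from how
    much group means and group covariances vary. Assumes p >= 2 variables
    and at least two observations per group."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False, bias=True)            # marginal covariance
    groups = np.unique(labels)
    props = [(labels == g).mean() for g in groups]        # group proportions
    means = [X[labels == g].mean(axis=0) for g in groups]
    covs = [np.cov(X[labels == g], rowvar=False, bias=True) for g in groups]
    pooled = sum(w * S for w, S in zip(props, covs))      # pooled within-group covariance
    sigma_inv = np.linalg.inv(sigma)

    # Assumed kernel: variation in means (M_I) plus variation in covariances (M_II).
    M_I = sum(w * np.outer(m - mu, m - mu) for w, m in zip(props, means))
    M_II = sum(w * (S - pooled) @ sigma_inv @ (S - pooled)
               for w, S in zip(props, covs))
    M = M_I @ sigma_inv @ M_I + M_II

    # Generalized eigenproblem M v = l * sigma * v, solved by Cholesky whitening.
    L_inv = np.linalg.inv(np.linalg.cholesky(sigma))
    A = L_inv @ M @ L_inv.T
    evals, U = np.linalg.eigh((A + A.T) / 2)              # symmetrize for stability
    order = np.argsort(evals)[::-1]                       # largest eigenvalue first
    V = L_inv.T @ U[:, order]
    V /= np.linalg.norm(V, axis=0)                        # unit-length directions
    return evals[order], V

# Simulated example: two groups in 5 dimensions that differ only in the first variable.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 5))
labels = np.repeat([0, 1], 200)
X[labels == 1, 0] += 6.0

evals, V = reduction_directions(X, labels)
Z = X @ V[:, :2]   # observations projected onto the two leading directions
```

In this simulated setting the leading direction should load mainly on the first variable, since that is the only coordinate in which the groups differ. In practice the labels and moments would come from a fitted mixture model rather than being known in advance.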

Similar resources

Robust Cluster Analysis via Mixture Models

Finite mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster data sets. In this paper, we focus on the use of normal mixture models to cluster data sets of continuous multivariate data. As normality based methods of estimation are not robust, we review the use of t component distributions. With the t mixture model-based approa...

Mixtures of common t-factor analyzers for clustering high-dimensional microarray data

MOTIVATION: Mixtures of factor analyzers enable model-based clustering to be undertaken for high-dimensional microarray data, where the number of observations n is small relative to the number of genes p. Moreover, when the number of clusters is not small, for example, where there are several different types of cancer, there may be the need to reduce further the number of parameters in the speci...

Robust Fuzzy Classification Maximum Likelihood Clustering with Multivariate t-Distributions

Mixtures of distributions have been used as probability models for clustering data. The classification maximum likelihood (CML) procedure is a popular mixture maximum likelihood approach to clustering. Yang (1993) extended CML to fuzzy CML (FCML) for a normal mixture model, called FCML-N. However, normal distributions are not robust to outliers. In general, t-distributions should be more robust...

Rejoinder to the discussion of "Model-based clustering and classification with non-normal mixture distributions"

Non-normal mixture distributions have received increasing attention in recent years. Finite mixtures of multivariate skew-symmetric distributions, in particular, the skew normal and skew t-mixture models, are emerging as promising extensions to the traditional normal and t-mixture models. Most of these parametric families of skew distributions are closely related, and can be classified into fou...

Location and scale mixtures of Gaussians with flexible tail behaviour: Properties, inference and application to multivariate clustering

The family of location and scale mixtures of Gaussians has the ability to generate a number of flexible distributional forms. The family nests as particular cases several important asymmetric distributions like the Generalised Hyperbolic distribution. The Generalised Hyperbolic distribution in turn nests many other well known distributions such as the Normal Inverse Gaussian. In a multivariate ...

Journal:
  • Adv. Data Analysis and Classification

Volume 7

Publication date: 2013